Abstract: Generalized t-tests are constructed under weaker than normal conditions.
In the first part of this paper we assume only the symmetry (around zero) of the
error distribution (i). In the second part we assume that the error distribution
is a Gaussian scale mixture (ii). The optimal (smallest) critical values can be
computed from generalizations of Student's cumulative distribution function
(cdf), $t_n(x)$. The cdf's of the generalized t-test statistics are denoted by
(i) $t_n^S(x)$ and (ii) $t_n^G(x)$, resp. As the sample size $n \to \infty$ we get
the counterparts of the standard normal cdf $\Phi(x)$:
(i) $\Phi^S(x) := \lim_{n\to\infty} t_n^S(x)$, and
(ii) $\Phi^G(x) := \lim_{n\to\infty} t_n^G(x)$. Explicit formulae are given for the
underlying new cdf's. For example $\Phi^G(x) = \Phi(x)$ iff $|x| \ge \sqrt{3}$. Thus
the classical 95% confidence interval for the unknown expected value of Gaussian
distributions covers the center of symmetry with at least 95% probability for
Gaussian scale mixture distributions. On the other hand, the 90% quantile of
$\Phi^G$ is $4\sqrt{3}/5 = 1.385\ldots > \Phi^{-1}(0.9) = 1.282\ldots$.

1. Introduction 

An inspiring recent paper by Lehmann [9] summarizes Student's contributions to 
small sample theory in the period 1908-1933. Lehmann quoted Student [10]: "The 
question of applicability of normal theory to non-normal material is, however, of 
considerable importance and merits attention both from the mathematician and 
from those of us whose province it is to apply the results of his labours to practical 
work." 

In this paper we consider two important classes of distributions. The first class is 
the class of all symmetric distributions. The second class consists of scale mixtures 
of normal distributions which contains all symmetric stable distributions, Laplace, 
logistic, exponential power, Student's t, etc. For scale mixtures of normal
distributions see Kelker [8], Efron and Olshen [5], Gneiting [7], Benjamini [1]. Gaussian
scale mixtures are important in finance, bioinformatics and in many other areas of 
applications where the errors are heavy tailed. 

First, let $X_1, X_2, \dots, X_n$ be independent (not necessarily identically
distributed) observations, and let $\mu$ be an unknown parameter with

$$X_i = \mu + \xi_i, \quad i = 1, 2, \dots, n,$$

where the random errors $\xi_i$, $1 \le i \le n$, are independent and symmetrically
distributed around zero. Suppose that

$$\xi_i = s_i \eta_i, \quad i = 1, 2, \dots, n,$$

where $s_i, \eta_i$, $i = 1, 2, \dots, n$, are independent pairs of random variables,
and the random scale, $s_i \ge 0$, is also independent of $\eta_i$. We also assume
the $\eta_i$ variables are identically distributed with given cdf $F$ such that
$F(x) + F(-x-) = 1$ for all real numbers $x$.

Student's t-statistic is defined as $T_n = \sqrt{n}(\bar{X} - \mu)/S$, $n = 2, 3, \dots$,
where $\bar{X} = \sum_{i=1}^n X_i/n$ and $S^2 = \sum_{i=1}^n (X_i - \bar{X})^2/(n-1) \ne 0$.

Introduce the notation

$$a^2 := \frac{n x^2}{x^2 + n - 1}.$$

For $x \ge 0$,

$$P\{|T_n| > x\} = P\{T_n^2 > x^2\} = P\left\{ \frac{\left(\sum_{i=1}^n \xi_i\right)^2}{\sum_{i=1}^n \xi_i^2} > a^2 \right\}. \tag{1.1}$$

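(For the idea of this equation see Efron [4] p. 1279.) Conditioning on the random
scales $s_1, s_2, \dots, s_n$, (1.1) becomes

$$P\{|T_n| > x\} = E\, P\left\{ \frac{\left(\sum_{i=1}^n s_i \eta_i\right)^2}{\sum_{i=1}^n s_i^2 \eta_i^2} > a^2 \,\Big|\, s_1, s_2, \dots, s_n \right\}
\le \sup_{\sigma_k \ge 0,\ k=1,2,\dots,n} P\left\{ \frac{\left(\sum_{i=1}^n \sigma_i \eta_i\right)^2}{\sum_{i=1}^n \sigma_i^2 \eta_i^2} > a^2 \right\},$$

where $\sigma_1, \sigma_2, \dots, \sigma_n$ are arbitrary nonnegative, non-random numbers
with $\sigma_i > 0$ for at least one $i = 1, 2, \dots, n$.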

For Gaussian errors $P\{|T_n| > x\} = P(|t_{n-1}| > x)$ where $t_{n-1}$ is a t-distributed
random variable with $n - 1$ degrees of freedom. The corresponding cdf is denoted
by $t_{n-1}(x)$. Suppose $a \ge 0$. For scale mixtures of the cdf $F$ introduce

$$1 - t^{(F)}_{n-1}(a) := \frac{1}{2} \sup_{\sigma_k \ge 0,\ k=1,2,\dots,n} P\left\{ \frac{\left(\sum_{i=1}^n \sigma_i \eta_i\right)^2}{\sum_{i=1}^n \sigma_i^2 \eta_i^2} > a^2 \right\}. \tag{1.2}$$

For $a < 0$, $t^{(F)}_{n-1}(a) := 1 - t^{(F)}_{n-1}(-a)$. It is clear that if
$1 - t^{(F)}_{n-1}(a) \le \alpha/2$ then $P\{|T_n| > x\} \le \alpha$. This is the
starting point of our two excursions.
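
The equivalence behind (1.1) can be checked numerically. The following is a minimal sketch, assuming $\mu = 0$ and an arbitrary illustrative choice of random scales (uniform on [0.1, 5]) multiplying standard normal variables; the function names are hypothetical and not part of the paper. It uses the algebraic identity $T_n^2 = (n-1)R/(n-R)$ with $R = (\sum \xi_i)^2/\sum \xi_i^2$, which is what makes the events $\{T_n^2 > x^2\}$ and $\{R > a^2\}$ coincide.

```python
import random

def t_squared_and_ratio(errors):
    """Return (T_n^2, R) for errors xi_i = X_i - mu, where
    R = (sum xi_i)^2 / sum xi_i^2."""
    n = len(errors)
    mean = sum(errors) / n
    s2 = sum((e - mean) ** 2 for e in errors) / (n - 1)   # sample variance S^2
    t2 = n * mean ** 2 / s2                               # T_n^2 (mu cancels out)
    ratio = sum(errors) ** 2 / sum(e * e for e in errors)
    return t2, ratio

def a_squared(x, n):
    """a^2 = n x^2 / (x^2 + n - 1), the notation introduced before (1.1)."""
    return n * x * x / (x * x + n - 1)

rng = random.Random(0)
n, x = 7, 1.5
trials = 2000
agree = 0
for _ in range(trials):
    # symmetric errors: random scale times a standard normal (illustrative choice)
    errs = [rng.uniform(0.1, 5.0) * rng.gauss(0.0, 1.0) for _ in range(n)]
    t2, ratio = t_squared_and_ratio(errs)
    agree += (t2 > x * x) == (ratio > a_squared(x, n))
print(agree, "of", trials, "samples give the same answer for both events")
```

Any other symmetric error generator could be substituted for the one above; the identity is purely algebraic and does not depend on the distribution.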

First, we assume $F$ is the cdf of a symmetric Bernoulli random variable supported
on $\pm 1$ ($p = 1/2$). In this case the set of scale mixtures of $F$ is the complete set
of symmetric distributions around 0, and the corresponding $t$ is denoted by $t^S$
($t^S_n(x) = t^{(F)}_n(x)$ when $F$ is the Bernoulli cdf). In the second excursion we assume
$F$ is Gaussian; the corresponding $t$ is denoted by $t^G$.

How to choose between these two models? If the error tails are lighter than the 
Gaussian tails, then of course we cannot apply the Gaussian scale mixture model. 
On the other hand, there are lots of models (for example the variance gamma 
model in finance) where the error distributions are supposed to be scale mixtures 
of Gaussian distributions (centered at 0). In this case it is preferable to apply 
the second model because the corresponding upper quantiles are smaller. For an 
intermediate model where the errors are symmetric and unimodal see Székely and
Bakirov [11]. Here we could apply a classical theorem of Khinchin (see Feller [6]); 
according to this theorem all symmetric unimodal distributions are scale mixtures 
of symmetric uniform distributions. 
2. Symmetric errors: scale mixtures of coin flipping variables

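Introduce the Bernoulli random variables $\varepsilon_i$, $P(\varepsilon_i = \pm 1) = 1/2$.
Let $\mathcal{P}$ denote the set of vectors $p = (p_1, p_2, \dots, p_n)$ with Euclidean
norm 1, $\sum_{k=1}^n p_k^2 = 1$. Then, according to (1.2), if the role of $\eta_i$ is
played by $\varepsilon_i$ with the property that $\varepsilon_i^2 = 1$,

$$1 - t^S_{n-1}(a) = \sup_{p \in \mathcal{P}} P\{ p_1 \varepsilon_1 + p_2 \varepsilon_2 + \cdots + p_n \varepsilon_n \ge a \}.$$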

The main result of this section is the following.

Theorem 2.1. For $0 < a \le \sqrt{n}$,

$$2^{-\lceil a^2 \rceil} \le 1 - t^S_{n-1}(a) = \frac{m}{2^n},$$

where $m$ is the maximum number of vertices $v = (\pm 1, \pm 1, \dots, \pm 1)$ of the
n-dimensional standard cube that can be covered by an n-dimensional closed sphere
of radius $r = \sqrt{n - a^2}$. (For $a > \sqrt{n}$, $1 - t^S_{n-1}(a) = 0$.)

Proof. Denote by $\mathcal{P}_a$ the set of all n-dimensional vectors with Euclidean
norm $a$. The crucial observation is the following. For all $a > 0$,

$$1 - t^S_{n-1}(a) = \sup_{p \in \mathcal{P}} P\left\{ \sum_{j=1}^n p_j \varepsilon_j \ge a \right\}
= \sup_{p \in \mathcal{P}_a} P\left\{ \sum_{j=1}^n (\varepsilon_j - p_j)^2 \le n - a^2 \right\}. \tag{2.1}$$

Here the inequality $\sum_{j=1}^n (\varepsilon_j - p_j)^2 \le n - a^2$ means that the point

$$v = (\varepsilon_1, \varepsilon_2, \dots, \varepsilon_n),$$

a vertex of the n-dimensional standard cube, falls inside the (closed) sphere
$G(p, r)$ with center $p \in \mathcal{P}_a$ and radius $r = \sqrt{n - a^2}$. Thus

$$1 - t^S_{n-1}(a) = \frac{m}{2^n},$$

where $m$ is the maximal number of vertices $v = (\pm 1, \pm 1, \dots, \pm 1)$ which can be
covered by an n-dimensional closed sphere with given radius $r = \sqrt{n - a^2}$ and
varying center $p \in \mathcal{P}_a$. It is clear that without loss of generality we
can assume that the Euclidean norm of the optimal center is $a$.

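If $k \ge 0$ is an integer and $a^2 \le n - k$, then $m \ge 2^k$ because one can always find
$2^k$ vertices which can be covered by a sphere of radius $\sqrt{k}$. Take, e.g., the vertices

$$(\underbrace{1, 1, \dots, 1}_{n-k}, \underbrace{\pm 1, \pm 1, \dots, \pm 1}_{k}),$$

and the sphere $G(c, \sqrt{k})$ with center

$$c = (\underbrace{1, 1, \dots, 1}_{n-k}, \underbrace{0, 0, \dots, 0}_{k}).$$

With the constant $C = a/\sqrt{n-k}$, $0 < C \le 1$, the point $p = Cc$ has norm $a$, and the
squared distance between $p$ and each of the vertices above is
$(n-k)(1-C)^2 + k \le n - a^2$ (the inequality is equivalent to $C \le 1$), so the sphere
$G(p, r)$ covers these $2^k$ vertices. Choosing $k = n - \lceil a^2 \rceil$ proves the lower
bound $2^{-\lceil a^2 \rceil} \le 1 - t^S_{n-1}(a)$ in the Theorem. Thus the
theorem is proved. □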
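
The lower bound can be illustrated by exact enumeration for small $n$. This is a sketch only: it evaluates the tail for one particular direction $p$, the unit vector with $\lceil a^2 \rceil$ equal nonzero coordinates (the normalized center $c$ from the proof), and does not search for the optimal $p$. The helper names and the tolerance `tol` are assumptions introduced here.

```python
import itertools
import math

def bernoulli_tail(p, a, tol=1e-12):
    """Exact P{p_1 e_1 + ... + p_n e_n >= a} for independent signs e_i = +-1,
    obtained by enumerating all 2^n sign vectors (small n only)."""
    n = len(p)
    hits = sum(
        1
        for signs in itertools.product((1, -1), repeat=n)
        if sum(pi * e for pi, e in zip(p, signs)) >= a - tol   # tol guards rounding
    )
    return hits / 2 ** n

def flat_direction(n, j):
    """Unit vector with j equal nonzero coordinates (hypothetical helper)."""
    return [1 / math.sqrt(j)] * j + [0.0] * (n - j)

n = 6
for a in (1.0, 1.2, math.sqrt(3), 2.0):
    j = math.ceil(a * a - 1e-12)          # ceil(a^2), guarded against rounding
    tail = bernoulli_tail(flat_direction(n, j), a)
    print(f"a={a:.3f}  ceil(a^2)={j}  tail={tail:.5f}  2^-ceil(a^2)={2.0 ** -j:.5f}")
```

For this direction the event occurs exactly when the first $\lceil a^2 \rceil$ signs are all $+1$, so the tail equals $2^{-\lceil a^2 \rceil}$; the supremum over all $p$ in Theorem 2.1 can only be larger.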
Remark 1. Critical values for the $t^S$-test can be computed as the infima of the
$x$-values for which $1 - t^S_{n-1}\left(\sqrt{n x^2/(n - 1 + x^2)}\right) \le \alpha$.

Remark 2. Define the counterpart of the standard normal distribution as follows.

$$\Phi^S(a) \stackrel{\mathrm{def}}{=} \lim_{n \to \infty} t^S_n(a).$$

Theorem 2.1 implies that for $a > 0$,

$$1 - 2^{-\lceil a^2 \rceil} \ge \Phi^S(a). \tag{2.2}$$

Our computations suggest that the upper tail probabilities of $\Phi^S$ can be
approximated by $2^{-\lceil a^2 \rceil}$ so well that the .9, .95, .975 quantiles of
$\Phi^S$ are equal to $\sqrt{3}$, 2, $\sqrt{5}$, resp. with at least three decimal
precision. We conjecture that $\Phi^S(\sqrt{3}) = .9$, $\Phi^S(2) = .95$,
$\Phi^S(\sqrt{5}) = .975$. On the other hand, the .999 and higher
quantiles almost coincide with the corresponding standard normal quantiles, thus
in this case we do not need to pay a heavy price for dropping the condition of
normality. On this problem see also the related papers by Eaton [2] and Edelman
[3].
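
To see where the values $\sqrt{3}$, 2, $\sqrt{5}$ come from, one can compute the quantiles of the step function $1 - 2^{-\lceil a^2 \rceil}$ itself. The sketch below assumes, purely for illustration, that the tail is exactly this step function (the text only reports it as a good approximation); `step_cdf` and `step_quantile` are hypothetical names.

```python
import math

def step_cdf(a):
    """1 - 2^(-ceil(a^2)) for a > 0: the step approximation discussed in Remark 2."""
    return 1.0 - 2.0 ** (-math.ceil(a * a))

def step_quantile(level):
    """inf{a > 0 : step_cdf(a) >= level} for 0.5 <= level < 1.

    step_cdf(a) >= level holds iff ceil(a^2) >= bits, i.e. iff a^2 > bits - 1,
    so the infimum is sqrt(bits - 1)."""
    bits = math.ceil(math.log2(1.0 / (1.0 - level)))
    return math.sqrt(bits - 1)

for level in (0.9, 0.95, 0.975):
    print(level, round(step_quantile(level), 4))
```

The three printed values are $\sqrt{3}$, 2 and $\sqrt{5}$ rounded to four decimals, the jump points of the step function just below the levels .9, .95 and .975.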

3. Gaussian scale mixture errors 

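An important subclass of symmetric distributions consists of the scale mixtures
of Gaussian distributions. In this case the errors can be represented in the form
$\xi_i = s_i Z_i$ where $s_i \ge 0$ as before and independent of the standard normal $Z_i$. We
have the equation

$$1 - t^G_{n-1}(a) = \sup_{\sigma_k \ge 0,\ k=1,2,\dots,n} P\left\{ \frac{\sigma_1 Z_1 + \sigma_2 Z_2 + \cdots + \sigma_n Z_n}{\sqrt{\sigma_1^2 Z_1^2 + \sigma_2^2 Z_2^2 + \cdots + \sigma_n^2 Z_n^2}} > a \right\}. \tag{3.1}$$

Recall that $a^2 = \dfrac{n x^2}{n - 1 + x^2}$ and thus $x = \sqrt{\dfrac{a^2 (n-1)}{n - a^2}}$.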



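Theorem 3.1. Suppose $n > 1$. Then for $0 \le a < 1$, $t^G_{n-1}(a) = 1/2$, $t^G_{n-1}(1) = 3/4$,
for $a \ge \sqrt{n}$, $t^G_{n-1}(a) = 1$, and finally, for $1 < a < \sqrt{n}$,

$$1 - t^G_{n-1}(a) = \max_{a^2 < k \le n} P\left( \frac{Z_1 + Z_2 + \cdots + Z_k}{\sqrt{Z_1^2 + Z_2^2 + \cdots + Z_k^2}} > a \right)
= \max_{a^2 < k \le n} P\left( t_{k-1} > \sqrt{\frac{a^2 (k-1)}{k - a^2}} \right),$$

where $t_{k-1}$ is a t-distributed random variable with $k - 1$ degrees of freedom.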

The point of this theorem is that $\sup_{\sigma_1, \sigma_2, \dots, \sigma_n}$ in (3.1) is
taken when all nonzero $\sigma$'s are equal and here the number of zeros depends on $a$.
For details see Székely and Bakirov [11].
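
The formula of Theorem 3.1 is straightforward to evaluate. The following is a minimal sketch using only the standard library: the Student tail is obtained by Simpson's rule after the substitution $s = \sqrt{\nu}\tan\theta$ (a numerical choice made here, not taken from the paper), and `t_tail`, `tG_tail` are hypothetical names.

```python
import math

def t_tail(u, nu, steps=2000):
    """P(t_nu > u) for u >= 0. With s = sqrt(nu) * tan(theta) the tail equals
    c * integral of cos(theta)^(nu - 1) over [atan(u / sqrt(nu)), pi / 2]."""
    c = math.exp(math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)) / math.sqrt(math.pi)
    lo, hi = math.atan(u / math.sqrt(nu)), math.pi / 2
    h = (hi - lo) / steps                     # steps must be even for Simpson's rule
    f = lambda th: math.cos(th) ** (nu - 1)
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return c * total * h / 3

def tG_tail(a, n):
    """1 - t^G_{n-1}(a), following the case distinction of Theorem 3.1 (a >= 0)."""
    if a < 1:
        return 0.5
    if a == 1:
        return 0.25
    if a * a >= n:
        return 0.0
    ks = range(math.floor(a * a) + 1, n + 1)          # integers k with a^2 < k <= n
    return max(t_tail(math.sqrt(a * a * (k - 1) / (k - a * a)), k - 1) for k in ks)

for a in (1.2, 4 * math.sqrt(3) / 5, 1.6, 2.0):
    print(f"a={a:.4f}  1 - tG_(n-1)(a) = {tG_tail(a, 20):.5f}   (n = 20)")
```

Each term of the maximum corresponds to setting $k$ of the $\sigma$'s equal and the remaining $n - k$ to zero, which is the configuration the paragraph above identifies as extremal.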

Compute the intersection points of the curves

$$a \mapsto P\left( t_{k-1} > \sqrt{\frac{a^2 (k-1)}{k - a^2}} \right)$$

for two neighboring indices. We get the following equation

$$P\left( t_{k-1} > \sqrt{\frac{a^2 (k-1)}{k - a^2}} \right) = P\left( t_k > \sqrt{\frac{a^2 k}{k + 1 - a^2}} \right)$$

for the intersection point $A(k)$. It is not hard to show that $\lim_{k \to \infty} A(k) = \sqrt{3}$.
This leads to the following:

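Corollary 1. There exists a sequence $A(1) := 1 < A(2) < A(3) < \cdots < A(k) \to \sqrt{3}$
such that

(i) for $a \in [A(k-1), A(k)]$, $k = 2, 3, \dots, n - 1$,

$$t^G_{n-1}(a) = P\left( t_{k-1} \le \sqrt{\frac{a^2 (k-1)}{k - a^2}} \right),$$

(ii) for $a \ge \sqrt{3}$, that is for $x \ge \sqrt{3(n-1)/(n-3)}$,

$$t^G_{n-1}(a) = t_{n-1}(a).$$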

The most surprising part of Corollary 1 is of course the nice limit, $\sqrt{3}$. This shows
that above $\sqrt{3}$ the usual t-test applies even if the errors are not necessarily normal,
only scale mixtures of normals. Below $\sqrt{3}$, however, the 'robustness' of the t-test
gradually decreases. Splus can easily compute that A(2) = 1.726, A(3) = 2.040.
According to our Table 1, the one-sided 0.025 level critical values coincide with the
classical t-critical values.

Recall that for $x \ge 0$, the Gaussian scale mixture counterpart of the standard
normal cdf is

$$\Phi^G(x) := \lim_{n \to \infty} t^G_n(x). \tag{3.2}$$

(Note that in the limit, as $n \to \infty$, we have $a = x$ if both are assumed to be
nonnegative; $\Phi^G(-x) = 1 - \Phi^G(x)$.)

Corollary 2. For $0 \le x < 1$, $\Phi^G(x) = .5$, $\Phi^G(1) = .75$, and for $x \ge \sqrt{3}$,
$\Phi^G(x) = \Phi(x)$, where $\Phi(x)$ is the standard normal cdf
($\Phi^G(\sqrt{3}) = \Phi(\sqrt{3}) = 0.958$).

For quantiles between .5 and .875 the max in Theorem 3.1 is taken at $k = 2$
and thus in this interval $\Phi^G(x) = C\left(x/\sqrt{2 - x^2}\right)$, where $C(x)$ is the standard
Cauchy cdf. This is the convex section of the curve $\Phi^G(x)$, $x \ge 0$. Interestingly the
convex part is followed by a linear section: $\Phi^G(x) = x/(2\sqrt{3}) + 1/2$ for
$1.3136\ldots < x < 1.4282\ldots$. Thus the 90% quantile is exactly $4\sqrt{3}/5$:
$\Phi^G(4\sqrt{3}/5) = 0.9$.
The following critical values are important in applications: 0.95 = $\Phi(1.645)$ =
$\Phi^G(1.650)$, 0.9 = $\Phi(1.282)$ = $\Phi^G(1.386)$, 0.875 = $\Phi(1.150)$ = $\Phi^G(1.307)$ (see the
last row of Table 1).
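
The endpoints of the linear section are the points where neighbouring curves of Theorem 3.1 cross (indices $k = 2, 3$ and $k = 3, 4$ in the limit $a = x$). The sketch below locates them by bisection, using the closed-form cdf's of the t-distribution with 1, 2 and 3 degrees of freedom; the function names, the bracketing interval $(1, \sqrt{k})$ and the iteration count are choices made here for illustration.

```python
import math

def tail_k(a, k):
    """P(t_{k-1} > sqrt(a^2 (k-1) / (k - a^2))) for k = 2, 3, 4 (closed forms)."""
    u = math.sqrt(a * a * (k - 1) / (k - a * a))
    if k == 2:                                   # Cauchy (1 degree of freedom)
        return 0.5 - math.atan(u) / math.pi
    if k == 3:                                   # 2 degrees of freedom
        return 0.5 - u / (2 * math.sqrt(2 + u * u))
    if k == 4:                                   # 3 degrees of freedom
        v = u / math.sqrt(3)
        return 0.5 - (v / (1 + v * v) + math.atan(v)) / math.pi
    raise ValueError("only k = 2, 3, 4 are implemented in this sketch")

def crossing(k, iters=80):
    """Bisection for the a in (1, sqrt(k)) where the curves k and k + 1 meet."""
    lo, hi = 1.0, math.sqrt(k) - 1e-9
    g = lambda a: tail_k(a, k) - tail_k(a, k + 1)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:                  # sign change in [lo, mid]
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

print("crossing of k=2 and k=3:", crossing(2))
print("crossing of k=3 and k=4:", crossing(3))
print("cdf on the linear section at 4*sqrt(3)/5:", 1 - tail_k(4 * math.sqrt(3) / 5, 3))
```

The two crossings should land close to the endpoints 1.3136 and 1.4282 quoted above, and the last line evaluates the $k = 3$ curve at $4\sqrt{3}/5$, where it reduces to $x/(2\sqrt{3}) + 1/2 = 0.9$.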

Remark 3. It is clear that for $a > 0$ we have the inequalities $t_n(a) \ge t^G_n(a) \ge t^S_n(a)$.
According to Corollary 1, the first inequality becomes an equality iff $a \ge \sqrt{3}$. In
connection with the second inequality one can show that the difference of the
$\alpha$-quantiles of $t^G_n(a)$ and $t^S_n(a)$ tends to 0 as $\alpha \to 1$.
Table 1

Critical values for Gaussian scale mixture errors
computed from $1 - t^G_{n-1}\left(\sqrt{n x^2/(n - 1 + x^2)}\right) = \alpha$

  n-1     alpha = 0.125   0.100   0.050   0.025
    2         1.625       1.886   2.920   4.303
    3         1.495       1.664   2.353   3.182
    4         1.440       1.579   2.132   2.776
    5         1.410       1.534   2.015   2.571
    6         1.391       1.506   1.943   2.447
    7         1.378       1.487   1.895   2.365
    8         1.368       1.473   1.860   2.306
    9         1.361       1.462   1.833   2.262
   10         1.355       1.454   1.812   2.228
   11         1.351       1.448   1.796   2.201
   12         1.347       1.442   1.782   2.179
   13         1.344       1.437   1.771   2.160
   14         1.341       1.434   1.761   2.145
   15         1.338       1.430   1.753   2.131
   16         1.336       1.427   1.746   2.120
   17         1.335       1.425   1.740   2.110
   18         1.333       1.422   1.735   2.101
   19         1.332       1.420   1.730   2.093
   20         1.330       1.419   1.725   2.086
   21         1.329       1.417   1.722   2.080
   22         1.328       1.416   1.718   2.074
   23         1.327       1.414   1.715   2.069
   24         1.326       1.413   1.712   2.064
   25         1.325       1.412   1.709   2.060
  100         1.311       1.392   1.664   1.984
  500         1.307       1.387   1.652   1.965
1,000         1.307       1.386   1.651   1.962
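
Entries of Table 1 can be recomputed from Theorem 3.1. The sketch below assumes that each entry is the value of $x$ at which the upper tail $1 - t^G_{n-1}(a)$, with $a^2 = n x^2/(n - 1 + x^2)$, equals $\alpha$, and finds it by bisection. The Student tail is evaluated by Simpson's rule, and all function names, the bracket [1, 50] and the iteration counts are choices made here, not taken from the paper.

```python
import math

def t_tail(u, nu, steps=1000):
    """P(t_nu > u) for u >= 0, via s = sqrt(nu) * tan(theta) and Simpson's rule."""
    c = math.exp(math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)) / math.sqrt(math.pi)
    lo, hi = math.atan(u / math.sqrt(nu)), math.pi / 2
    h = (hi - lo) / steps
    f = lambda th: math.cos(th) ** (nu - 1)
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return c * total * h / 3

def tG_tail_x(x, n):
    """Upper tail 1 - t^G_{n-1}(a) written as a function of x >= 0."""
    a2 = n * x * x / (n - 1 + x * x)                  # a^2, always below n
    if a2 <= 1.0:
        return 0.5 if a2 < 1.0 else 0.25
    ks = range(math.floor(a2) + 1, n + 1)             # integers k with a^2 < k <= n
    return max(t_tail(math.sqrt(a2 * (k - 1) / (k - a2)), k - 1) for k in ks)

def critical_value(n, alpha, lo=1.0, hi=50.0, iters=60):
    """Bisection for the x with tail alpha; the tail is non-increasing in x."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if tG_tail_x(mid, n) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (2, 3):                                     # df = n - 1, first two rows
    row = [critical_value(df + 1, alpha) for alpha in (0.125, 0.100, 0.050, 0.025)]
    print(df, " ".join(f"{x:.3f}" for x in row))
```

For larger $n$ the same functions apply, at the cost of evaluating more terms in the maximum.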



Our approach can also be applied for two-sample tests. In a joint forthcoming
paper with N. K. Bakirov the Behrens-Fisher problem will be discussed for Gaussian
scale mixture errors with the help of our $t^G_n(x)$ function.